Results 1 - 20 of 44
1.
International Journal of Emerging Technologies in Learning ; 18(10):184-203, 2023.
Article in English | Scopus | ID: covidwho-20237547

ABSTRACT

During the COVID-19 pandemic, many universities in Thailand were locked down and classrooms were moved to a fully online format. It was challenging for teachers to manage online learning, and especially to track student behavior, since the teacher could not observe and notify students directly. To alleviate this problem, one solution that has become increasingly important is the prediction of student performance based on log data. This study therefore aims to analyze student behavior data by applying predictive analytics to a Moodle log of approximately 54,803 events. Six machine learning classifiers (Neural Network, Random Forest, Decision Tree, Logistic Regression, Linear Regression, and Support Vector Machine) were applied to predict student performance. Further, we compared the effectiveness of early prediction at four stages: 25%, 50%, 75%, and 100% of the course. The prediction models could guide future studies, motivate self-preparation, and reduce dropout rates. In the experiment, the models were evaluated with 5-fold cross-validation. Results indicated that the Decision Tree performed best, at 81.10%, upon course completion. Meanwhile, the SVM had the best result, 86.90%, at the first stage (25% of the course), and Linear Regression performed best at the middle stages, at 70.80% and 80.20% respectively. The results could be applied to other courses and to larger e-learning system logs with similar student activity conditions, which could contribute to more accurate student performance prediction. © 2023, International Journal of Emerging Technologies in Learning. All Rights Reserved.
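
The 5-fold cross-validation mentioned in this abstract can be illustrated with a minimal sketch. This is not the authors' code: the function name `k_fold_indices` is hypothetical, the split is contiguous and unshuffled for simplicity, and a real experiment would normally shuffle or stratify the samples first.

```python
def k_fold_indices(n_samples, k=5):
    """Split range(n_samples) into k contiguous (train, test) index pairs.

    Hypothetical helper for illustration only; no shuffling is done.
    """
    if k < 2 or k > n_samples:
        raise ValueError("k must be between 2 and n_samples")
    # The first (n_samples % k) folds receive one extra sample.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    folds = []
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        folds.append((train_idx, test_idx))
        start += size
    return folds


if __name__ == "__main__":
    # Each sample is used for testing exactly once across the folds.
    for train_idx, test_idx in k_fold_indices(10, k=5):
        print("train:", train_idx, "test:", test_idx)
```

Each model would be fitted on the training indices and scored on the held-out indices, and the per-fold scores averaged.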

2.
2022 IEEE Creative Communication and Innovative Technology, ICCIT 2022 ; 2022.
Article in English | Scopus | ID: covidwho-20237219

ABSTRACT

Covid-19 emerged at the end of December 2019 as an outbreak that spread almost worldwide. While this research was carried out, the Covid-19 pandemic was still ongoing. Many countries have made various attempts to overcome Covid-19. In Indonesia, the government and stakeholders, including researchers, have undertaken many activities to reduce the number of positive patients. One of these activities is the vaccination program, which is believed to be the most effective in reducing the number of positive Covid-19 cases. But nobody knows when the Covid-19 pandemic will end. Stakeholders need to know the trend of Covid-19 cases in Indonesia in order to make better decisions in facing them. This study aims to predict the number of positive Covid-19 cases in Indonesia by comparing the performance of two machine learning methods, Support Vector Regression (SVR) and Long Short-Term Memory (LSTM). The study used the Covid-19 in Indonesia dataset from the Control Team, covering 13 January 2021 to 08 November 2021, with 300 records. The models were evaluated on R-squared (R2), Mean Absolute Error (MAE), and Mean Square Error (MSE). The research found that Support Vector Regression performed better than Long Short-Term Memory for predicting the number of Covid-19 cases, with R-squared, MAE, and MSE values of 0.902, 0.163, and 0.072 respectively. © 2022 IEEE.
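
The three evaluation measures named in this abstract (R2, MAE, MSE) follow standard definitions, sketched below in plain Python. The function name `regression_metrics` and the sample numbers are illustrative assumptions and are unrelated to the study's data.

```python
def regression_metrics(y_true, y_pred):
    """Return (r2, mae, mse) for two equal-length sequences of numbers."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n          # mean absolute error
    mse = sum(e * e for e in errors) / n           # mean squared error
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)            # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    # R2 is undefined when the observed values are all identical.
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")
    return r2, mae, mse


if __name__ == "__main__":
    # Made-up values, for illustration only.
    print(regression_metrics([1, 2, 3, 4], [1, 2, 3, 5]))
```

A higher R2 and lower MAE and MSE indicate a better fit, which is the basis on which the two methods are compared.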

3.
Proceedings of the 17th INDIACom|2023 10th International Conference on Computing for Sustainable Global Development, INDIACom 2023 ; : 1096-1100, 2023.
Article in English | Scopus | ID: covidwho-20235056

ABSTRACT

The Covid-19 outbreak and the lockdowns have increased the use of online platforms, which has affected users. Cyberbullying is one of the negative outcomes of using social media platforms and leads to mental and physical distress. This study proposes a machine learning-based approach for the detection of cyberbullying in Hinglish text. We use the Hinglish Code-Mixed Corpus, which consists of over 6,000 tweets, for our experiments. We use various machine learning algorithms, including Logistic Regression (LR), Multinomial Naive Bayes (MNB), Support Vector Machine (SVM), and Random Forest (RF), to train our models. We evaluate the performance of the models using standard evaluation metrics such as precision, recall, and F1-score. Our experiments show that LR with Term Frequency-Inverse Document Frequency (TF-IDF) features outperforms the other models, achieving 92% accuracy. Our study demonstrates that machine learning models can be effective for cyberbullying detection in Hinglish text, and the proposed approach can help identify and prevent cyberbullying on social media platforms. © 2023 Bharati Vidyapeeth, New Delhi.
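
The TF-IDF weighting referred to in this abstract can be sketched as follows. This uses the plain textbook form (term frequency times the natural log of inverse document frequency); the function name `tf_idf` and the toy tokens are assumptions, and library implementations usually apply smoothing and normalisation, so their values will differ.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per tokenised document.

    tf  = term count / document length
    idf = ln(number of documents / number of documents containing the term)
    """
    n_docs = len(documents)
    doc_freq = Counter()
    for tokens in documents:
        doc_freq.update(set(tokens))  # count each term once per document
    weights = []
    for tokens in documents:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


if __name__ == "__main__":
    # Toy tokenised texts, purely illustrative.
    docs = [["you", "are", "great"], ["you", "are", "awful"]]
    for vector in tf_idf(docs):
        print(vector)
```

Terms that occur in every document receive weight zero, so the distinguishing words dominate the feature vector that a classifier such as logistic regression is trained on.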

4.
International Journal of Engineering Business Management ; 15, 2023.
Article in English | Web of Science | ID: covidwho-2323009

ABSTRACT

Flight demand forecasting is a particularly critical component of airline revenue management because of its direct influence on the booking limits that determine airline profits. Traditional flight demand forecasting models generally take only the day of the week (DOW) and the cumulative bookings at the current data collection point (DCP) as model input, and use linear regression, exponential smoothing, pick-up, and other models to predict the final bookings of flights. These models can be regarded as time series flight demand forecasting models based on the interval between the current date and the departure date. They fail to consider how early bookings change within a specific flight's pre-sale period and have weak generalization ability, which leads to poor adaptability to random changes in flight bookings. The support vector regression (SVR) model, which is derived from machine learning, adapts well to nonlinear random changes in data and can adaptively learn the random disturbances of flight bookings. In this paper, flight bookings are automatically divided into peak, medium, and off (PMO) according to the season attribute. The SVR model is trained on a vector composed of historical flight bookings and the cumulative DCP bookings in the early stage of the flight pre-sale period. Compared with the traditional models, more prior information about the flight is used. We collect 2 years of domestic route bookings data from an airline in China before COVID-19 as the training and testing datasets, and divide these data into three categories: tourism, business, and general. The numerical results show that the SVR model significantly improves the forecasting accuracy and reduces RMSE compared with the traditional models. Therefore, this study provides a better choice for flight demand forecasting.

5.
AEU - International Journal of Electronics and Communications ; : 154723, 2023.
Article in English | ScienceDirect | ID: covidwho-2321722

ABSTRACT

Wireless body area networks (WBANs) are helpful for remote health monitoring, especially during the COVID-19 pandemic. Due to the limited batteries of bio-sensors, energy-efficient routing is vital to achieve load-balancing and prolong the network's lifetime. Although many routing techniques have been presented for WBANs, they were designed for an application, and their performance may be degraded in other applications. In this paper, an ensemble Metaheuristic-Driven Machine Learning Routing Protocol (MDML-RP) is introduced as an adaptive real-time remote health monitoring in WBANs. The motivation behind this technique is to utilize the superior route optimization solutions offered by metaheuristics and to integrate them with the real-time routing capability of machine learning. The proposed method involves two phases: offline model tuning and online routing. During the offline pre-processing step, a metaheuristic algorithm based on the whale optimization algorithm (WOA) is used to optimize routes across various WBAN configurations. By applying WOA for multiple WBANs, a comprehensive dataset is generated. This dataset is then used to train and test a machine learning regressor that is based on support vector regression (SVR). Next, the optimized MDML-RP model is applied as an adaptive real-time protocol, which can efficiently respond to just-in-time requests in new, previously unseen WBANs. Simulation results in various WBANs demonstrate the superiority of the MDML-RP model in terms of application-specific performance measures when compared with the existing heuristic, metaheuristic, and machine learning protocols. The findings indicate that the proposed MDML-RP model achieves noteworthy improvement rates across various performance metrics when compared to the existing techniques, with an average improvement of 42.3% for the network lifetime, 15.4% for reliability, 31.3% for path loss, and 31.7% for hot-spot temperature.

6.
ASEAN Engineering Journal ; 13(1):21-25, 2023.
Article in English | Scopus | ID: covidwho-2305022

ABSTRACT

The Covid-19 virus is threatening the world with health, social, and economic implications, and data are being collected continuously around the world during the pandemic for modelling and predicting the future. In this work, the support vector regression technique was used to predict daily death values due to the Covid-19 virus. The models were created for the world, the United States of America, the United Kingdom, and Turkey. All the regression models were tested using coefficient of determination (R2) and root mean square error (RMSE) values. The analysis also compared the suitability of linear, radial, and polynomial kernels. The radial kernel produced relatively better results. In predicting the world data, support vector regression with the radial kernel produced an R2 value of 0.805262 on test data. The models created for the United States of America, the United Kingdom, and Turkey gave R2 values of 0.723376, 0.95412, and 0.875343 respectively on test data. In addition, for the country-specific models, using only the country's data was compared with also using the whole-world data. In general, modelling with the world data combined with the country data gave better predictions. © 2023 Penerbit UTM Press. All rights reserved.
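
The three kernel families compared in this abstract have standard textbook forms, sketched below. The function names and the default values of `gamma`, `degree`, and `coef0` are illustrative assumptions; the abstract does not report the hyperparameters the authors used.

```python
import math


def _dot(x, z):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, z))


def linear_kernel(x, z):
    """k(x, z) = <x, z>"""
    return _dot(x, z)


def polynomial_kernel(x, z, degree=3, gamma=1.0, coef0=1.0):
    """k(x, z) = (gamma * <x, z> + coef0) ** degree"""
    return (gamma * _dot(x, z) + coef0) ** degree


def rbf_kernel(x, z, gamma=1.0):
    """Radial basis kernel: k(x, z) = exp(-gamma * ||x - z||^2)"""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)


if __name__ == "__main__":
    x, z = [1.0, 2.0], [3.0, 4.0]
    print(linear_kernel(x, z), polynomial_kernel(x, z, degree=2), rbf_kernel(x, z))
```

The radial kernel depends only on the distance between two points, which lets the regression follow local, nonlinear changes in a series; the linear kernel cannot do this.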

7.
Springer Proceedings in Mathematics and Statistics ; 414:123-134, 2023.
Article in English | Scopus | ID: covidwho-2304950

ABSTRACT

Public opinions shared on common platforms like Twitter, Facebook, Instagram, etc. act as sources of information for experts. Transportation and analysis of such data is very important and difficult due to data regulations and its structure. Pre-processing approaches and word-based dictionaries are used to understand the unprocessed data and make it possible to analyze the opinions/tweets. Machine learning algorithms learn from past experience and use a variety of statistical, probabilistic and optimization algorithms to detect useful patterns in unstructured data sets. Our study aims to compare the performance of classification algorithms in predicting individuals as COVID-19(+) or COVID-19(−) using the emotions in tweets, via text mining procedures. Logistic Regression (LR), Support Vector Machine (SVM), Naive Bayes (NB), Decision Trees (DT), Random Forest (RF), Artificial Neural Networks (ANN), Gradient Boost (GBM) and XGradient algorithms were used, and the accuracy of each model was assessed for the detection and identification of the disease related to the COVID-19 virus, which has been on the agenda recently. © 2023, The Author(s), under exclusive license to Springer Nature Switzerland AG.

8.
Healthcare Analytics ; 2 (no pagination), 2022.
Article in English | EMBASE | ID: covidwho-2303536

ABSTRACT

This study aims to (1) correlate and visualise the Coronavirus disease 19 (COVID-19) pandemic spread via Spearman rank coefficients of network analysis (NA) and (2) predict the cumulative number of COVID-19 confirmed and death cases via support vector regression (SVR) based on a COVID-19 dataset in Malaysia between July 2020 and June 2021. The NA indicated increasing connectivity between different states throughout the time frame, revealing the most complex network of COVID-19 transmission in the second quarter of 2021. The SVR model predicted future COVID-19 cases and deaths in Malaysia in the second half of 2021. The study demonstrated that the NA and SVR could provide relatively simple yet valuable artificial intelligence techniques for visualising the degree of connectivity and predicting pandemic risk based on confirmed COVID-19 cases and deaths. The Malaysian health authorities used the NA and SVR model results for preventive measures in highly populated states. Copyright © 2022 The Author(s)

9.
Lecture Notes on Data Engineering and Communications Technologies ; 165:465-479, 2023.
Article in English | Scopus | ID: covidwho-2296443

ABSTRACT

Classical statistics are usually based on parametric models, whose performance depends heavily on assumptions and is not robust in the presence of outliers in the data. Due to the COVID-19 pandemic, our daily lives have changed significantly, including slowing economic growth. These extreme changes can manifest as outliers in time series studies and adversely affect the results of data analysis. Many classical methods of official statistics are sensitive to outliers. In this work, we evaluate two machine learning methods, Support Vector Regression (SVR) and Random Forest (RF), and compare them with ARIMA to determine their robustness through simulation studies. Robustness is measured by the sensitivity of the SVR and Random Forest hyperparameters and the model's error in the presence of outliers. Simulations show that more outliers lead to higher RMSE values, and conversely, more samples lead to lower RMSE values. The type of outlier significantly impacts the RMSE value of the ARIMA model, where additive outliers (AO) have a worse impact than temporary changes (TC). Consecutive outliers produce a smaller RMSE mean than non-consecutive outliers. Based on the sensitivity of hyperparameters, SVR and Random Forest models are relatively robust to the presence of outliers in the data. Based on the simulation results of 100 iterations, we find that SVR is more robust than ARIMA and Random Forest in modeling time series data with outliers. © 2023, The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

10.
3rd International Conference on Mathematics and its Applications in Science and Engineering, ICMASE 2022 ; 414:123-134, 2023.
Article in English | Scopus | ID: covidwho-2284657

ABSTRACT

Public opinions shared on common platforms like Twitter, Facebook, Instagram, etc. act as sources of information for experts. Transportation and analysis of such data is very important and difficult due to data regulations and its structure. Pre-processing approaches and word-based dictionaries are used to understand the unprocessed data and make it possible to analyze the opinions/tweets. Machine learning algorithms learn from past experience and use a variety of statistical, probabilistic and optimization algorithms to detect useful patterns in unstructured data sets. Our study aims to compare the performance of classification algorithms in predicting individuals as COVID-19(+) or COVID-19(−) using the emotions in tweets, via text mining procedures. Logistic Regression (LR), Support Vector Machine (SVM), Naive Bayes (NB), Decision Trees (DT), Random Forest (RF), Artificial Neural Networks (ANN), Gradient Boost (GBM) and XGradient algorithms were used, and the accuracy of each model was assessed for the detection and identification of the disease related to the COVID-19 virus, which has been on the agenda recently. © 2023, The Author(s), under exclusive license to Springer Nature Switzerland AG.

11.
International Conference on Cyber Security, Privacy and Networking, ICSPN 2022 ; 599 LNNS:134-149, 2023.
Article in English | Scopus | ID: covidwho-2284531

ABSTRACT

This research develops a COVID-19 patient recovery prediction model using machine learning. A publicly available dataset of infected patients is pre-processed to prepare data on 450 patients for building a prediction model, with 20.27% recovered cases and 79.73% not recovered/dead cases. An efficient logistic regression (ELR) model is built by stacking random forest (RF) and logistic regression (LR) classifiers. Further, the proposed model is compared with state-of-the-art models such as logistic regression (LR), support vector machine (SVM), decision tree (C5.0), and random forest (RF). All the models are evaluated with different metrics and statistical tests. The results show that the proposed ELR model is good at predicting not recovered/dead cases and at handling imbalanced data. © 2023, The Author(s), under exclusive license to Springer Nature Switzerland AG.

12.
5th International Seminar on Research of Information Technology and Intelligent Systems, ISRITI 2022 ; : 306-312, 2022.
Article in English | Scopus | ID: covidwho-2280614

ABSTRACT

Shopping behavior has shifted toward online shopping. Especially since Coronavirus Disease 2019 (COVID-19), people choose online shopping rather than going to the market, for economic and hygienic reasons. Reviews help sellers earn customers' trust in their products, but some dishonest sellers use fake reviews to boost their products. Fake reviews are commonly generated randomly by a computer bot or written by someone who has not used the product. Some researchers are already working on fake review detection to address this problem, using many methods. In this paper, we compared three supervised machine learning algorithms: Support Vector Machine (SVM), Logistic Regression (LR), and Random Forest (RF). After preprocessing the data and extracting Term Frequency-Inverse Document Frequency (TF-IDF) features, we ran the first experiment without tuning. For the other experiments, we tuned the parameters of each algorithm using 5-fold cross-validation. The results showed that SVM performed best of the three both before and after tuning, with 88.89% and 89.77%, respectively. © 2022 IEEE.

13.
Am J Infect Control ; 2023 Mar 30.
Article in English | MEDLINE | ID: covidwho-2281266

ABSTRACT

BACKGROUND: This study aims to show that including pairwise hierarchical interactions of covariates and combining forecasts from individual models improves prediction accuracy. METHODS: The least absolute shrinkage and selection operator via hierarchical pairwise interaction is used to select uncorrelated variables with the greatest predictive power. Single forecast models (Gradient boosting method [GBM], Generalized additive models [GAMs], Support vector regression [SVR]) are used in the analysis. The best model was selected based on the mean absolute error (MAE), the best key performance indicator for skewed data. Forecasts from the 5 models were combined using linear quantile regression averaging (LQRA). Box and Whiskers plots are used to diagnose the overall performance of fitted models. RESULTS: Single forecast models (GBM, GAMs, and SVRs) show that including pairwise interactions improves forecast accuracy. The SVR model with interactions based on the radial basis kernel function is the best of the single forecast models, with the lowest MAE. Combining point forecasts from all the single forecast models using the LQRA approach further reduces the MAE. However, based on the Box and Whiskers plot, the SVR model with pairwise interactions has the smallest range. CONCLUSIONS: Based on the key performance indicators, combining predictions from several individual models improves forecast accuracy. However, overall, the SVR with pairwise hierarchical interactions outperforms all the other models.

14.
Math Biosci Eng ; 19(12): 12316-12333, 2022 08 23.
Article in English | MEDLINE | ID: covidwho-2231596

ABSTRACT

Due to the emergence of the novel coronavirus disease, many recent studies have investigated prediction methods for infectious disease transmission. This paper proposes a framework to quickly screen infection control scenarios and identify the most effective scheme for reducing the number of infected individuals. Analytical methods, as typified by the SIR model, can conduct trial-and-error verification with low computational costs; however, they must be reformulated to introduce additional constraints, and thus are inappropriate for case studies considering detailed constraint parameters. In contrast, multi-agent system (MAS) simulators introduce detailed parameters but incur high computation costs per simulation, making them unsuitable for extracting effective measures. Therefore, we propose a framework that implements an MAS for constructing a training dataset, and then trains a support vector regression (SVR) model to obtain effective measure results. The proposed framework overcomes the weaknesses of conventional methods to produce effective control measure recommendations. The constructed SVR model was experimentally verified by comparing its performance on datasets with expected and unexpected outputs. Although datasets producing an unexpected output decreased the prediction accuracy, by removing randomness from the training dataset, the accuracy of the proposed method was still high in these cases. High-precision predictions of the MAS-based simulation output were obtained for both test datasets in under one second of the computational time. Furthermore, the experimental results establish that the proposed framework can obtain intuitively correct outputs for unknown inputs, and produces sufficiently high-precision prediction with lower computation costs than an existing method.


Subject(s)
COVID-19, Humans, COVID-19/epidemiology

15.
13th International Conference on Computing Communication and Networking Technologies, ICCCNT 2022 ; 2022.
Article in English | Scopus | ID: covidwho-2213235

ABSTRACT

Covid-19 was first found in Wuhan, China, approximately a year and a half ago, and the virus's origin remains a mystery. However, it has been in the news in recent weeks, with reports suggesting that an infectious disease leaked from a Chinese laboratory, a claim that had previously been dismissed as a hoax. In this research paper, we present a model that performs sentiment analysis on users' social media comments about the origin of the corona virus. Nowadays most people express their feelings, the truth around them, and many lies on social media. We take this opportunity to do a sentiment analysis of the true, false, and confused feelings that people have expressed on social media about the origin of the corona virus. We used 20000 data items (comments) taken from popular corona virus-related Facebook news posts. To achieve the best results, we used five distinct machine learning classifiers, and our support vector machine and logistic regression models outscored the others. The support vector model had a testing accuracy of 83.73%, whereas logistic regression had an accuracy of 81.39%. The important thing about our research is that, at the end of the whole work, thousands of people's personal feelings, truths, hesitations, and confusion come together to indicate a strong possibility about the origin of the corona virus. © 2022 IEEE.

16.
Recent Advances in Electrical and Electronic Engineering ; 15(5):390-400, 2022.
Article in English | Scopus | ID: covidwho-2141271

ABSTRACT

Background: Coronavirus refers to a large group of RNA viruses that infect the respiratory tract in humans and also cause diseases in birds and mammals. SARS-CoV-2 gives rise to the infectious disease “COVID-19”. In March 2020, coronavirus was declared a pandemic by the WHO. The transmission rate of COVID-19 has been increasing rapidly; thus, it becomes indispensable to estimate the number of confirmed infected cases in the future. Objective: The study aims to forecast coronavirus cases using three ML algorithms, viz., support vector regression (SVR), polynomial regression (PR), and Bayesian ridge regression (BRR). Methods: There are several ML algorithms, such as decision tree, K-nearest neighbor, random forest, neural networks, and Naïve Bayes, but we have chosen PR, SVR, and BRR as they have many advantages in comparison to other algorithms. SVM is a widely used supervised ML algorithm developed by Vapnik and Cortes in the 1990s. It is used for both classification and regression. PR is known as a particular case of Multiple Linear Regression in Machine Learning. It models the relationship between an independent and a dependent variable as an nth-degree polynomial. Results: In this study, we have predicted the number of infected confirmed cases using three ML algorithms, viz. SVR, PR, and BRR. We have assumed that there are no precautionary measures in place. Conclusion: In this paper, predictions are made for the upcoming number of infected confirmed cases by analyzing datasets containing information about the day-wise past confirmed cases using ML models (SVR, PR and BRR). According to this paper, as compared to SVR and PR, BRR performed far better in the future forecasting of the infected confirmed cases owing to coronavirus. © 2022 Bentham Science Publishers.

17.
2nd Asian Conference on Innovation in Technology, ASIANCON 2022 ; 2022.
Article in English | Scopus | ID: covidwho-2136098

ABSTRACT

The variations in the price of crude oil are very erratic, nonlinear, and dynamic, with a high degree of uncertainty, making it much more difficult to predict accurately. As a result, the opacity and intricacy in determining the crude oil price have been a significant topic of interest for researchers. This paper develops an efficient Genetic Algorithm (GA) based fine-tuned Support Vector Regression (SVR) model for predicting crude oil prices. The strategy utilizes key economic factors that ascertain the price per barrel, which serve as the input. The NASDAQ dataset used in this work encompasses ten years of daily data. The GA technique fine-tunes the parameters of the SVR model to boost the model's ability to foresee crude oil price fluctuations. The proposed model's performance is evaluated by employing various major criteria that compare our model to its counterparts, such as SVR and Long Short-Term Memory (LSTM) approaches. In light of these criteria, the findings of root mean square error (RMSE) and mean absolute percentage error (MAPE) indicate that this model surpasses others in predicting crude oil prices more accurately. Finally, this study also analyzes the impact of persistent uncertainty concerning the COVID-19 outbreak on crude oil price trends. © 2022 IEEE.

18.
3rd International Conference on Artificial Intelligence and Data Sciences, AiDAS 2022 ; : 310-315, 2022.
Article in English | Scopus | ID: covidwho-2136076

ABSTRACT

COVID-19 has had a major impact on the world and has spread to every corner of it. As a result, the tourism industry suffered greatly, with many tourist sites having to close. Previous research has used regression models to predict the impact of COVID-19, though few studies have linked it to the number of tourists. This paper uses five different regression models to predict tourism rates based on multiple countries' COVID-19 data. The regression models are linear regression, polynomial regression, K-Nearest Neighbors regression, random forest regression, and support vector regression. The datasets that we use are COVID-19 data containing the number of cases and Indonesia's tourism data containing the monthly number of incoming tourists to Indonesia from different countries. The dataset is processed by selecting the countries with the largest numbers of tourists. The preprocessed dataset is divided into two parts for training and testing the models, with an 8:2 ratio. The evaluation showed that random forest regression has the highest accuracy, with an R2 score of 0.9. Our research is limited by the number of datasets used, as there might be other variables that are not considered. © 2022 IEEE.

19.
Journal of System and Management Sciences ; 12(2):174-194, 2022.
Article in English | Scopus | ID: covidwho-2026593

ABSTRACT

Coronavirus attacks have affected countless countries. Death rates in most countries are increasing day by day, and we have attempted to propose many considerations about the principal problems that cause dangerous infections across the globe. In this work, the dietary patterns of 170 countries are considered to identify correlations between diet practices and the death rates and confirmed and recovered cases of COVID-19. We used data on food intake by country, together with data on the spread of COVID-19 and other health issues, to gain new insights into the importance of nutrition and eating habits in combating the spread of infectious diseases. We built machine learning regressors (ridge regression, support vector regression, random forest, and XGBoost) to predict the mortality rate based on food intake information and obesity. Two approaches were considered: one with all food-related features taken as parameters, and a simpler one that reduced the dimensionality by using only two features, animal products and vegetal products. Both have issues (mainly of spread and non-linearity), but we could use different models and metrics. Next, we built a model to predict obesity rates based on eating habits in each country. The proposed model was far more effective, and it captured and anticipated the general trend of the data. We have also used data visualization approaches to get better insights into the data considered. © 2022, Success Culture Press. All rights reserved.

20.
3rd International Conference on Advances in Distributed Computing and Machine Learning, ICADCML 2022 ; 427:413-423, 2022.
Article in English | Scopus | ID: covidwho-2014006

ABSTRACT

Foreign Exchange (FOREX) is a decentralized global market for exchanging currencies. The Forex market is enormous, and it operates 24 hours a day. Along with country-specific factors, Forex trading is influenced by cross-country ties and a variety of global events. Recent pandemic scenarios such as COVID-19 and local elections can also have a significant impact on market pricing. In this work, we tested and compared various predictions with external elements such as news items. Additionally, we compared classical machine learning methods to deep learning algorithms. We also added sentiment features from news headlines using NLP-based word embeddings and compared the performance. Our results indicate that simple regression models such as linear, SGD, and bagged regressors performed better than deep learning models such as LSTM and RNN for single-step forecasts such as the next two hours, the next day, and seven days. Surprisingly, news articles failed to improve the predictions, indicating that only domain-based and relevant information adds value. Among the text vectorization techniques, Word2Vec and SentenceBERT perform better. © 2022, The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
